23  Phenology: Exploring patterns

Learning Objectives

After completing this tutorial you should be able to

Download the directory for this project here. Make sure the directory is unzipped and move it to your bi328 directory. You can open the Rproj for this module in one of three ways: double click on it, which will launch RStudio; open RStudio and use File > Open Project; or click on the Rproject icon in the top right of your program window and select Open Project.

There should be a file named 23_Phenology.qmd in that project directory. Use that file to work through this tutorial - you will hand in your rendered (“knitted”) Quarto file as your homework assignment. So, first thing, change the author in the YAML header to your name. You will use this Quarto document to record your answers. Remember to use comments to annotate your code; at minimum you should have one comment per code set1, and you may of course add as many comments as you need to be able to recall what you did. Similarly, take notes in the document as we discuss discussion/reflection questions, but make sure that you go back and clean them up for “public consumption”.

  • 1 You should do this whether you are adding code yourself or using code from our manual, even if it isn’t commented in the manual… especially when the code is already included for you, add comments to describe how the function works/what it does as we introduce it during the participatory coding session so you can refer back to it.

  • Before you get started we are going to need an additional set of packages that we will want to install first.

    # install tidymodels packages
    install.packages("tidymodels")

    Now, let’s make sure to read in all the packages we will need.

    # load libraries ----
    
    # reporting
    library(knitr)
    
    # visualization
    library(ggplot2)
    library(ggthemes)
    library(patchwork)
    
    # data wrangling
    library(dplyr)
    library(tidyr)
    library(readr)
    library(skimr)
    library(janitor)
    library(magrittr)
    library(lubridate)
    
    # modelling
    library(tidymodels)
    
    # set other options ----
    options(scipen=999)
    
    knitr::opts_chunk$set(
      tidy = FALSE,
      message = FALSE,
      warning = FALSE)

    23.1 Introduction to phenology

    Consider this

    Briefly define the terms phenology, phenophases, life history and life history traits and argue how you think climate change might impact these.

    Did it!

    [Your answer here]

    We are going to explore a data set that contains information on the phenology of Minnesota species to determine to what extent seasonal events are tied to climate or whether they depend on other factors. To do this we are going to use a data set generated by the Minnesota Phenology Network.

    Consider this

    Go to the Minnesota Phenological Network’s website and explore the short introductory and history statements in the “Home” and “About” sections. Check out the “Meet the species” page that presents Minnesota’s seven superstar species. Get to know them in the “Read more” sections.

    Pick two of the seven superstars and list the specific phenophases that are associated with each species.

    Did it!

    [Your answer here]

    23.2 Explore the data set

    We are going to use data from the Minnesota Phenology Network to explore how climate change might be impacting the phenology of species.

    pheno <- read_delim("data/mnpn_master_dataset_2018.v2.txt", delim = "\t") %>%
      clean_names()

    Let’s get an overview of the data set.

    Give it a whirl

    Use your exploratory analysis skills to get an idea of the dimensions of the data set, what variables are contained in the data set and what data types they are. What function can you use?

    Write a brief description of the data set.

    Did it!

    [Your answer here]

    That’s right - skim() will give you all the information in one handy place!

    skim(pheno)

    Table 23.1: Data overview MN phenology network data set.

    (a) Data summary

    Name                     pheno
    Number of rows           54741
    Number of columns        12
    Column type frequency:
      character              9
      numeric                3
    Group variables          None

    Variable type: character

    skim_variable        n_missing  complete_rate  min  max  empty  n_unique  whitespace
    day                          0           1.00    1   10      0      4329           1
    event                        1           1.00    4   36      0       120           0
    species_common_name        136           1.00    3   48      0      1827           0
    genus                      594           0.99    3   15      0       748           0
    species                   1566           0.97    2   24      0      1089           0
    county                      83           1.00    2   17      0        43           0
    lifeform                     1           1.00    6    7      0         3           0
    group                      136           1.00    4   22      0        17           0
    invasive                 53339           0.03    3    3      0         1           0

    Variable type: numeric

    skim_variable  n_missing  complete_rate     mean     sd    p0   p25   p50   p75  p100  hist
    year                   0              1  1997.56  16.53  1941  1986  2004  2011  2017  ▁▂▂▂▇
    dataset                0              1     6.56   4.46     1     2     7    12    13  ▇▃▅▁▇
    day_of_year           84              1   173.95  58.82     2   130   164   211   366  ▁▇▇▃▁

    You do still have to write your description though …

    One thing you may have noticed is that we have a column day that contains the date - however, currently it is formatted as a character. This could cause issues down the line when we want to plot things.

    Dates are notoriously difficult to deal with2. A useful package to deal with dates is lubridate.

  • 2 Remember when we listed out the many, many, many ways we can write a date and how conventions might differ between fields and countries?

  • Let’s use functions from that package to create a column called date that has the data type Date, and while we are at it, we can also learn how to create new columns with the month and day.

    pheno <- pheno %>%
      mutate(date = mdy(day),            # converts character in format month day year to date
             month = month(date),        # extract month
             day = day(date))            # extract day
    
    
    # check class
    class(pheno$date)
    [1] "Date"
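
    As a small illustration of why the parsing function matters, here is a sketch using made-up date strings (they are not taken from the phenology data set). The lubridate functions mdy(), dmy(), and ymd() differ only in the order in which they expect month, day, and year, so the same string can turn into two different dates.

    ```r
    library(lubridate)

    # the same string is read differently depending on the assumed order
    us_style  <- mdy("03/04/2017")   # month-day-year: 4 March 2017
    eu_style  <- dmy("03/04/2017")   # day-month-year: 3 April 2017
    iso_style <- ymd("2017-03-04")   # year-month-day: 4 March 2017

    us_style == iso_style   # TRUE, same calendar day
    us_style == eu_style    # FALSE, different days

    # once parsed, components can be extracted from the Date
    month(us_style)   # 3
    day(us_style)     # 4
    yday(us_style)    # 63, the day of the year
    ```

    This is why it is worth checking a few parsed values against the original character column before moving on.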

    Now let’s get an idea on what data is contained in the data set.

    Give it a whirl

    Describe how you can use the skimr output to determine how many unique entries (categories) we have for our categorical variables. Then use a function to print which life forms, groups of species, and counties are contained in the data set to the console/your report.

    Did it!

    [Your answer here]

    The dplyr verb that can help you out here is distinct() or you could apply unique() to the column itself, which is a vector.

    # output unique entries as dataframe
    pheno %>%
      distinct(lifeform)
    # A tibble: 4 × 1
      lifeform
      <chr>   
    1 PLANTS  
    2 ANIMALS 
    3 <NA>    
    4 ABIOTIC 

    # output unique entries as vector
    unique(pheno$lifeform)
    [1] "PLANTS"  "ANIMALS" NA        "ABIOTIC"

    Give it a whirl

    Create a table that shows the number of species in each group of species.

    Did it!

    [Your answer here]
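
    If you are unsure where to start, here is one possible approach, sketched on a tiny made-up tibble (the object toy and its values are hypothetical; only the column names mirror the phenology data). The key idea is that a species can appear in many records, so counting rows is not the same as counting species; n_distinct() counts each species once.

    ```r
    library(dplyr)

    # made-up records, not taken from the MNPN data set
    toy <- tibble(
      group = c("BIRDS", "BIRDS", "BIRDS", "WOODY", "WOODY", "FORB"),
      species_common_name = c("MALLARD", "MALLARD", "KILLDEER",
                              "LILAC", "APPLE", "BLOODROOT")
    )

    # number of distinct species per group, largest groups first
    species_per_group <- toy %>%
      group_by(group) %>%
      summarize(n_species = n_distinct(species_common_name)) %>%
      arrange(desc(n_species))

    species_per_group
    ```

    Note that MALLARD appears twice but is counted once for BIRDS.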

    Let’s focus on woody plants for now.

    Give it a whirl

    Create a new object called woody that contains only woody plants.

    Did it!

    [Your answer here]

    Give it a whirl

    Determine what events are recorded for species in this category. Create an overview table with events organized alphabetically.

    Did it!

    [Your answer here]

    That’s a lot of events. Let’s try to get an idea of when these different events occur throughout the year.

    Give it a whirl

    Create a figure that effectively summarizes when different events typically occur throughout the year.

    Did it!

    [Your answer here]

    Since we are interested in distributions, you would want to look at histograms or box plots - this is one way your figure could look:

    Figure 23.1: Distribution of life history events throughout the year for Minnesota woody plants.
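
    The code for this figure is not shown, so the following is only a sketch of one way to build a comparable box plot. It uses simulated values (toy_events is made up), and the choice to order events by their median day of year is an assumption about what makes the figure easy to read.

    ```r
    library(ggplot2)

    set.seed(42)  # simulated data, for illustration only
    toy_events <- data.frame(
      event = rep(c("LEAF BUDBREAK", "FLOWERING", "FIRST FALL COLOR"), each = 20),
      day_of_year = c(rnorm(20, 120, 8), rnorm(20, 135, 10), rnorm(20, 265, 12))
    )

    # one box per event, events ordered by their median day of year
    p_events <- ggplot(toy_events,
                       aes(x = day_of_year,
                           y = reorder(event, day_of_year, FUN = median))) +
      geom_boxplot() +
      labs(x = "day of year", y = "event")

    p_events
    ```

    Putting the events on the y axis keeps long event names readable without rotating labels.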

    23.3 Formulate a specific question

    Since we are interested in whether climate change is impacting phenology, let’s choose a specific event that we think is likely to be linked to changes in temperature and could be a good indicator of changing phenology.

    Consider this

    Pick three events that you think could be good indicator events to look at and argue why you think they would be interesting to explore.

    Did it!

    [Your answer here]

    Let’s start with the flowering date.

    Give it a whirl

    Create a subset called flowering that contains only records of flowering dates for woody plants. How could you plot this data to determine whether the flowering date has changed over time?

    Describe what you expect the pattern to look like if the flowering date is occurring earlier, later, or not changing over time.

    Plot the data and then use your predicted patterns to assess whether the flowering date is changing over time.

    This is what you want your figure to look like:

    Figure 23.2: Change in day of flowering for woody plants in Minnesota 1941 - 2018. Fill of individual points indicates the month of the flowering date; linear trendline is fitted in red.

    Let’s consider how useful this visualization is for answering our question. We combined all the woody species from all of Minnesota in the same plot. We probably don’t expect all the woody plants to react to changes in the same way or for that response to occur at the same pace. We should also consider that there could be differences based on the geographic location.

    In this case, it might be more effective to narrow our question and pull out a specific species. Let’s use the American Elm (Ulmus americana). This is a deciduous species with a geographic range throughout most of the eastern US and southeast Canada, i.e. Minnesota is at the northern limit of its range. Overall, this is a pretty hardy species that will grow to a considerable size and is frequently found in urban settings. Like all elm species in Minnesota, it is susceptible to Dutch elm disease, which is caused by an invasive fungal pathogen.

    Consider this

    Create a new object called elm that contains only entries for the American Elm.

    Did it!

    [Your answer here]

    Since the American Elm is so widespread, let’s also consider that we might want to eliminate geography as a confounding pattern.

    Give it a whirl

    Determine whether we have data for more than one county. Use that information to determine whether you want (need) to narrow down your data set. Explain your choice.

    Did it!

    [Your answer here]

    It appears we have found our question!

    Consider this

    State the specific question we are asking and give a brief description of the data you will need and how you will analyze it to answer that question.

    Did it!

    [Your answer here]

    Just so we are all in agreement - our question is: Has climate change changed the phenology of American Elm in Ramsey County, Minnesota?

    Do we already have all the data we need to answer that question?

    23.4 Determine changes in the flowering date of American elm in Ramsey County, MN

    Our first step will be determining whether or not there has been a change in the flowering date of the American Elm.

    Consider this

    Make a prediction of how you think the flowering date of the American Elm may have changed over the 50-year time span recorded in our data set. Describe what your figure should look like if you are indeed correct; be specific and argue why you are expecting this pattern.

    Did it!

    [Your answer here]

    Give it a whirl

    Filter your data so that it only contains entries for Ramsey County.

    Plot your data to determine if your prediction was correct; add a simple linear trend line to your plot to help identify the overall pattern.

    Make sure to give your visualization a title and add a legend3.

  • 3 You can do this using the chunk options (fig-cap) or directly on the figure itself

  • Did it!

    [Your answer here]

    This is what your figure should look like.

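    # keep only records from Ramsey County
    elm <- elm %>%
      filter(county == "RAMSEY")
    
    # flowering day vs. year; point fill shows the month of flowering
    ggplot(elm, aes(x = year, y = day_of_year, fill = month)) +
      geom_point(shape = 21, size = 3) +
      geom_smooth(method = "lm", color = "red") +   # linear trend line
      scale_fill_viridis_c() +
      labs(x = "year", y = "flowering date (day of year)") +
      theme_bw() +   # stand-in: theme_standard is not defined in this tutorial
      theme(legend.position = "bottom")

    Give it a whirl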

    Determine the rate of change (remember to include units!).

    Did it!

    [Your answer here]

    Hint: You will need to determine the equation of your linear trend line to do this.

    
    Call:
    lm(formula = day_of_year ~ year, data = elm)
    
    Residuals:
         Min       1Q   Median       3Q      Max 
    -21.7698  -5.7744  -0.6021   8.5442  19.2974 
    
    Coefficients:
                 Estimate Std. Error t value Pr(>|t|)  
    (Intercept) 258.27107  146.61188   1.762   0.0836 .
    year         -0.07927    0.07439  -1.066   0.2912  
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    
    Residual standard error: 10.81 on 56 degrees of freedom
    Multiple R-squared:  0.01987,   Adjusted R-squared:  0.002371 
    F-statistic: 1.135 on 1 and 56 DF,  p-value: 0.2912

    Don’t get too hung up on whether the result is significant or not. The rate is still the rate and you will want to include that in your results.
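
    In a regression of day_of_year on year, the coefficient for year is the slope of the trend line, so its units are days per year. The following sketch shows how the slope can be pulled out of a fitted model and scaled to the length of the record. It uses simulated data with a built-in trend (toy_flower is made up), not the elm records.

    ```r
    # simulated flowering dates with a built-in trend of -0.3 days per year
    set.seed(11)
    toy_flower <- data.frame(year = 1941:1991)
    toy_flower$day_of_year <- 125 - 0.3 * (toy_flower$year - 1941) + rnorm(51, sd = 3)

    # fit the linear trend line
    fit <- lm(day_of_year ~ year, data = toy_flower)

    # slope = rate of change in days per year
    slope <- unname(coef(fit)["year"])

    # total shift over the record = slope * number of years spanned
    shift <- slope * (max(toy_flower$year) - min(toy_flower$year))

    slope
    shift
    ```

    A negative slope means the event is occurring earlier in the year over time; a positive slope means it is occurring later.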

    Consider this

    Describe your overall results4.

    For your interpretation/discussion identify at least two possible mechanisms that could be causing this pattern. You will likely notice that you have a low R² value and the linear regression is not significant. What does that mean in this context? How can you apply that to generating possible mechanisms for the pattern?

  • 4 Remember, results are more than just figures! Practice being specific, i.e. consider how much the flowering date has shifted and at what rate

  • Did it!

    [Your answer here]

    23.5 Determine temperature change in Ramsey County

    If we want to investigate whether there is a relationship between climate change and change in the flowering date, we need climate data for Ramsey County for the same time period.

    Consider this

    Describe what your climate data set should look like and argue why you want to include certain variables. Describe the pattern you would expect to see in your climate data over time to support your hypothesis of what is driving the pattern you uncovered in the phenology data set. Be specific.

    Did it!

    [Your answer here]

    Consider what the ideal temporal and spatial resolution would need to be to “match” your phenology data.

    You ask and you shall receive:

    Go to the DNR Minnesota Climate Trends website to access historical observations for specific Minnesota locations. Use the drop-down menus and check boxes to select a subset of the data that meets the specifications you set out above.

    Unfortunately, the download button doesn’t work. Instead, copy and paste the data into a text editor or Excel and save it as a tab-delimited file in your data folder. Name it MN_temp.txt.

    Let’s read in your temperature data. Unless you added headers in the text file, it is missing them. Similar to the way we can use skip if there are additional lines at the top of a file, there is an argument, col_names, that we can use to specify column names if they are missing.

    temp <- read_delim("data/MN_temp.txt", delim = "\t", col_names = c("year", "temperature"))
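
    To see what col_names does without depending on your download, here is a self-contained sketch that writes a tiny headerless file to a temporary location and reads it back. The three rows are made up and the object names are hypothetical.

    ```r
    library(readr)

    # write a small, made-up headerless tab-delimited file
    tmp <- tempfile(fileext = ".txt")
    writeLines(c("1941\t28.1", "1942\t31.4", "1943\t25.9"), tmp)

    # supplying col_names keeps the first row as data instead of using it as a header
    toy_temp <- read_delim(tmp, delim = "\t",
                           col_names = c("year", "temperature"),
                           show_col_types = FALSE)

    toy_temp
    ```

    If you leave out col_names here, the first year and temperature would be used as column names and you would lose one row of data.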
    Consider this

    Plot your data to determine if your prediction was correct; add a simple linear trend line to your plot to help identify the overall pattern.

    Give your visualization a title and legend and describe your results. Be specific (i.e. don’t just state the overall trend, consider how much temperature has changed). For your interpretation/discussion consider how the data differs from your predictions and determine whether it supports your hypothesis of the mechanism.

    Did it!

    [Your answer here]

    Here’s what your figure could look like:

    Figure 23.3: Change in mean March temperature [Fahrenheit] for 1941 - 1991 (blue) and linear trendline (red).

    I used a line graph, which is a common option for visualizing change over time, as we are doing here with this time series. Using a line graph instead of a scatterplot is appropriate when the x axis shows ordered categories such as years; as we have discussed, a line plot is an appropriate visualization of a time series.

    We can add both a trendline and a line connecting individual points. This can be helpful because the linear trendline shows us the overall trend of the data while the line connecting the individual points helps us visualize local change from point to point.
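
    A minimal sketch of such a layered plot, using simulated temperatures (toy_temp is made up, and the colors simply follow the figure caption above): geom_line() and geom_point() show the year-to-year values, and geom_smooth(method = "lm") adds the linear trend line.

    ```r
    library(ggplot2)

    # simulated mean March temperatures, for illustration only
    set.seed(1)
    toy_temp <- data.frame(year = 1941:1991)
    toy_temp$temperature <- 28 + 0.05 * (toy_temp$year - 1941) + rnorm(51, sd = 4)

    p_temp <- ggplot(toy_temp, aes(x = year, y = temperature)) +
      geom_line(color = "blue") +                     # local change from year to year
      geom_point(color = "blue") +                    # individual observations
      geom_smooth(method = "lm", formula = y ~ x,     # overall linear trend
                  color = "red", se = FALSE) +
      labs(x = "year", y = "mean March temperature [F]")

    p_temp
    ```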

    For categorical data we can also use barplots:

    Figure 23.4: Change in mean March temperature [Fahrenheit] for 1941 - 1991 (orange bars).

    Why do you think it is better to use a scatter plot or line plot compared to the barplot?

    Give it a whirl

    For a good presentation of your results, e.g. through a presentation or poster, you would likely want to produce a figure where you have change in flowering date and change in mean temperature for March side by side. Practice making that type of figure here:

    Did it!

    [Your answer here]
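
    One way to place two figures side by side is the patchwork package, which we loaded at the start. The sketch below uses simulated data (toy is made up) and hypothetical object names; with patchwork, adding two ggplot objects with + arranges them next to each other.

    ```r
    library(ggplot2)
    library(patchwork)

    # simulated data, for illustration only
    set.seed(3)
    toy <- data.frame(year = 1941:1991)
    toy$day_of_year <- 125 - 0.1 * (toy$year - 1941) + rnorm(51, sd = 8)
    toy$temperature <- 28 + 0.05 * (toy$year - 1941) + rnorm(51, sd = 4)

    # panel A: flowering date over time
    p_flower <- ggplot(toy, aes(x = year, y = day_of_year)) +
      geom_point() +
      geom_smooth(method = "lm", formula = y ~ x, color = "red") +
      labs(x = "year", y = "flowering date (day of year)")

    # panel B: temperature over time
    p_temp <- ggplot(toy, aes(x = year, y = temperature)) +
      geom_line(color = "blue") +
      geom_smooth(method = "lm", formula = y ~ x, color = "red") +
      labs(x = "year", y = "mean March temperature [F]")

    # combine the panels and label them A and B
    combined <- p_flower + p_temp + plot_annotation(tag_levels = "A")
    combined
    ```

    Using p_flower / p_temp instead would stack the panels on top of each other.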

    23.6 Analyze relationship of flowering date and temperature

    Give it a whirl

    Especially when we look at the two figures side by side, we see a trend toward earlier flowering dates for American elm and toward warmer March temperatures. Describe the analysis you want to perform to determine whether there is a significant relationship between the mean temperature in March and the flowering date for the American Elm in Ramsey Co. Include what your dependent and independent variables are5.

    Create a plot to visualize the relationship but hold off on the analysis component until the next step.

  • 5 You are practicing writing up your methods. A good methods section includes not just what you want to do but why you are doing it/what that analysis is meant to achieve. A good way to start for this specific example would be e.g. “To determine whether {variable} depends on {variable}, we will {analysis}”

  • Did it!

    [Your answer here]

    To be able to plot temperature vs flowering date those two variables will need to be in the same data set. You can combine them using left_join().
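
    As a sketch of how such a join works, here are two tiny made-up tables (toy_elm and toy_temp are hypothetical) that share a year column. left_join() keeps every row of the first table and adds the matching columns from the second.

    ```r
    library(dplyr)

    # made-up flowering records (note that 1943 is missing)
    toy_elm <- tibble(year = c(1941, 1942, 1944),
                      day_of_year = c(118, 112, 121))

    # made-up temperature records
    toy_temp <- tibble(year = c(1941, 1942, 1943, 1944),
                       temperature = c(28.1, 31.4, 25.9, 27.0))

    # keep all flowering records and attach the temperature for the same year
    toy_joined <- left_join(toy_elm, toy_temp, by = "year")

    toy_joined
    ```

    The 1943 temperature is dropped because there is no flowering record for that year in the first table.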

    Here’s what your figure could look like:

    Figure 23.5: Relationship of mean temperature in March and flowering date for American Elm in Ramsey Co., MN.

    You’ve already learned how to fit linear regressions using the function lm(). As you can imagine, there is a different function for various models (and occasionally multiple functions for the same type of models) and each comes with a slightly different syntax. It also might be more helpful to have the output e.g. in a data.frame or tibble.

    There is a group of packages that has been designed to make interacting with models more consistent and processing the output in a more user-friendly way. You can install and load them as a group by calling tidymodels. From the name you have probably guessed that these packages have been designed to play well with the tidyverse.

    Here’s how you can run a linear regression using parsnip, which is the package designed to fit models using a tidy, unified interface. Essentially, it provides an interface to use different models behind the scenes. This means that you can use a consistent syntax instead of having to figure it out for each model. Because you have already used lm() without the interface, you will find that you are already familiar with the formula syntax that is used here.

    The first step will be loading the tidymodels packages and specifying the model we want to use, in this case that would be a linear regression model.

    linear_reg()                                      # specify model
    Linear Regression Model Specification (regression)
    
    Computational engine: lm 

    That’s not too exciting, because after we specify the type of model we need to tell R which engine (method) it should use to fit the model. For example, to use an ordinary least squares regression, we would set the engine to lm6.

  • 6 You could use the documentation page for linear_reg() to list all the possible engines.

    linear_reg() %>%                                  # specify model
      set_engine("lm")                                # Define computational engine
    Linear Regression Model Specification (regression)
    
    Computational engine: lm 
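
    Besides the documentation page, parsnip also has a helper, show_engines(), that lists the engines currently available for a model type. The sketch below assumes tidymodels is installed; the exact list you see depends on which extension packages are loaded.

    ```r
    library(tidymodels)

    # engines currently available for linear regression models
    show_engines("linear_reg")

    # the chosen engine is stored in the model specification
    spec <- linear_reg() %>%
      set_engine("lm")

    spec$engine
    ```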

    It appears we are getting somewhere. The last thing we still need to do is use the function fit() to tell it which variables we want to use to train/fit the model. We use the same formula syntax you are familiar with from using lm(). We will also use the function tidy() to convert the output into a handy tibble/data.frame.

    linear_reg() %>%                                  # specify model
      set_engine("lm") %>%                            # Define computational engine
      fit(day_of_year ~ temperature, data = elm) %>%  # define variables
      tidy()
    # A tibble: 2 × 5
      term        estimate std.error statistic  p.value
      <chr>          <dbl>     <dbl>     <dbl>    <dbl>
    1 (Intercept)   148.       4.60       32.1 1.51e-34
    2 temperature    -1.57     0.155     -10.1 1.35e-13

    This doesn’t look too different from the output you are used to seeing but we will see down the line that there are additional benefits to using the tidymodels framework when it comes to doing more sophisticated analysis.
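
    The tidy() output holds the coefficients but not model-level statistics such as R². One way to get those as a tibble, sketched here on simulated data (toy is made up), is to pull the underlying lm object out of the parsnip fit with extract_fit_engine() and pass it to glance().

    ```r
    library(tidymodels)

    # simulated temperatures and flowering dates, for illustration only
    set.seed(7)
    toy <- tibble(temperature = runif(40, 15, 40))
    toy$day_of_year <- 150 - 1.5 * toy$temperature + rnorm(40, sd = 5)

    # same three steps as above: specify, set engine, fit
    toy_fit <- linear_reg() %>%
      set_engine("lm") %>%
      fit(day_of_year ~ temperature, data = toy)

    # coefficients, one row per term
    toy_coefs <- tidy(toy_fit)

    # model-level summary (r.squared, p.value, ...) in a one-row tibble
    toy_summary <- toy_fit %>%
      extract_fit_engine() %>%
      glance()

    toy_coefs
    toy_summary
    ```

    Having both results as tibbles makes it straightforward to filter, join, or print them with kable().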

    We’ve previously used regressions mainly because we were interested in the slope to be able to calculate a rate of increase or decrease over time. Now we are interested in a relationship between two continuous variables and whether or not one of them (the independent variable, in our case temperature) has significant explanatory power for the dependent variable (in this case the flowering date). So let’s think about how we should interpret the results of our regression.

    The (Intercept) term tells us that for temperature = 0 Fahrenheit, the expected flowering date would be day 148 of the year. The temperature term (the slope) is negative, which tells us that on average, for every one degree Fahrenheit increase in temperature, we would expect the flowering date to occur 1.57 days earlier.
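
    To make the arithmetic concrete, here is a short sketch that plugs the rounded coefficients from the tidy() output into the regression equation, day of year = intercept + slope × temperature. The temperatures of 30 and 31 degrees Fahrenheit are hypothetical values chosen only for illustration.

    ```r
    # rounded coefficients copied from the tidy() output above
    intercept <- 148
    slope <- -1.57

    # expected flowering day for a given mean March temperature [F]
    predict_day <- function(temperature_f) {
      intercept + slope * temperature_f
    }

    predict_day(30)                     # 100.9
    predict_day(31) - predict_day(30)   # -1.57, i.e. 1.57 days earlier per degree
    ```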

    Consider this

    After first thinking about the purely mathematical explanation, we would want to think about what is biologically meaningful/sensible. Is there part of our interpretation above that biologically doesn’t entirely make sense? What does this tell you about extrapolating beyond the reach of your data?

    Did it!

    [Your answer here]

    Consider this

    Discuss your results, draw a conclusion about the relationship between American elm flowering date and mean March temperature, and then discuss that in the context of our original question.

    Did it!

    [Your answer here]

    Consider this

    Especially if you are writing a paper or report in the style of “Introduction/Methods/Results/Discussion”, a good discussion starts by reiterating the original (broad) question you asked with a short description of how you specifically investigated it, then states your key results, and then interprets/discusses those.

    You can essentially follow a fill-in-the-blank-formula, which as you become more comfortable communicating your research will sound a little less formulaic and a little bit more you; adapt the following for your answer:

    In this study, I investigated [broad question asked/hypothesis tested]. To achieve this, I [specific data set + analysis used]. I found that [key results: in your case you would make a specific statement about the trend of earlier flowering dates, the trend of warmer March temperatures, and the relationship of temperature and flowering date].

    And then you would follow this with a brief discussion of your results and how they apply to your initial question, i.e. we are asking whether climate change will impact the phenology of plants - how does this specific example apply to that question?

    Did it!

    [Your answer here]

    23.7 Now you!

    We asked a pretty general question to start us off with and used a specific species and a specific phenophase to investigate. To gather further evidence or determine if there are other patterns to observe we would want to investigate the phenology of further organisms.

    Good thing we have a very large data set to work with!

    Consider this

    Go back to the original data set with all the phenology records (object pheno). To find a good species to investigate we will want to make sure that we have sufficient data to make a meaningful statement about changes in the timing of phenophases.

    Use your data wrangling skills to identify all species in the data set with at least 30 years of data for a specific phenophase and location (i.e. you want at least 30 entries over at least 30 years).

    Did it!

    [Your answer here]

    You need to generate a table that looks like this - bonus: organize your table by group of species, species, and years of data available.

    group species_common_name event county min_year max_year time_observed n_observations
    AMPHIBIANS & REPTILES SPRING PEEPER FIRST HEARD ITASCA 1984 2016 32 35
    AMPHIBIANS & REPTILES WOOD FROG FIRST HEARD ITASCA 1984 2016 32 40
    BIRDS AMERICAN ROBIN ARRIVAL SHERBURNE 1975 2011 36 33
    BIRDS AMERICAN ROBIN FIRST FLOCK OF MIGRATORS ITASCA 1984 2016 32 36
    BIRDS AMERICAN ROBIN LAST SEEN ITASCA 1984 2014 30 56
    BIRDS BLUE WINGED TEAL ARRIVAL SHERBURNE 1980 2013 33 34
    BIRDS BUFFLEHEAD ARRIVAL SHERBURNE 1982 2012 30 30
    BIRDS CANADA GOOSE ARRIVAL SHERBURNE 1975 2011 36 31
    BIRDS COMMON GOLDENEYE ARRIVAL SHERBURNE 1982 2012 30 30
    BIRDS COMMON MERGANSER ARRIVAL SHERBURNE 1979 2012 33 33
    BIRDS DARK EYED JUNCO FIRST SEEN ITASCA 1984 2016 32 30
    BIRDS EASTERN BLUEBIRD ARRIVAL SHERBURNE 1975 2013 38 30
    BIRDS GREAT BLUE HERON ARRIVAL SHERBURNE 1975 2013 38 33
    BIRDS GREEN WINGED TEAL ARRIVAL SHERBURNE 1979 2013 34 32
    BIRDS HOODED MERGANSER ARRIVAL SHERBURNE 1982 2013 31 31
    BIRDS KILLDEER ARRIVAL SHERBURNE 1975 2013 38 34
    BIRDS LESSER SCAUP ARRIVAL SHERBURNE 1982 2012 30 30
    BIRDS MALLARD ARRIVAL SHERBURNE 1975 2011 36 32
    BIRDS NORTHERN HARRIER ARRIVAL SHERBURNE 1979 2012 33 30
    BIRDS NORTHERN PINTAIL ARRIVAL SHERBURNE 1979 2013 34 33
    BIRDS NORTHERN SHOVELER ARRIVAL SHERBURNE 1979 2012 33 33
    BIRDS PIED BILLED GREBE ARRIVAL SHERBURNE 1982 2013 31 30
    BIRDS RED WINGED BLACKBIRD ARRIVAL SHERBURNE 1977 2013 36 36
    BIRDS RING NECKED DUCK ARRIVAL SHERBURNE 1981 2012 31 32
    BIRDS RUFFED GROUSE FIRST COURTSHIP/TERRITORIAL BEHAVIOR ITASCA 1985 2016 31 48
    BIRDS SANDHILL CRANE ARRIVAL SHERBURNE 1980 2013 33 33
    BIRDS WOOD DUCK ARRIVAL SHERBURNE 1979 2013 34 35
    FORB BELLWORT FLOWERING HENNEPIN 1957 1991 34 32
    FORB BLOODROOT FLOWERING HENNEPIN 1957 1991 34 32
    FORB BLOODROOT LAST FLOWER HENNEPIN 1957 1991 34 33
    FORB BRIDAL WREATH FLOWERING RAMSEY 1941 1991 50 48
    FORB CANADA VIOLET FLOWERING HENNEPIN 1959 1991 32 30
    FORB CAROLINA PUCCOON FLOWERING HENNEPIN 1960 1991 31 31
    FORB COMMON MILKWEED FLOWERING ITASCA 1984 2016 32 31
    FORB COMMON SAINT JOHN’S WORT FLOWERING HENNEPIN 1960 1991 31 40
    FORB CROWFOOT FLOWERING HENNEPIN 1961 1991 30 30
    FORB CUT-LEAVED TOOTHWORT FLOWERING HENNEPIN 1960 2001 41 38
    FORB DANDELION FLOWERING ITASCA 1985 2016 31 33
    FORB FALSE RUE ANEMONE FLOWERING HENNEPIN 1960 1991 31 30
    FORB FALSE SOLOMON’S SEAL FLOWERING HENNEPIN 1960 1990 30 30
    FORB FIREWEED FLOWERING ITASCA 1984 2016 32 31
    FORB GOLDEN RAGWORT FLOWERING HENNEPIN 1960 1991 31 37
    FORB LARGE-FLOWERED TRILLIUM FLOWERING HENNEPIN 1958 1991 33 32
    FORB LARGE-FLOWERED TRILLIUM LAST FLOWER HENNEPIN 1958 1991 33 32
    FORB MARSH MARIGOLD FLOWERING HENNEPIN 1959 1991 32 31
    FORB MINNESOTA TROUT-LILY FLOWERING HENNEPIN 1960 1991 31 30
    FORB PURPLE TRILLIUM FLOWERING HENNEPIN 1957 1991 34 33
    FORB PURPLE TRILLIUM LAST FLOWER HENNEPIN 1957 1991 34 30
    FORB RUE ANEMONE FLOWERING HENNEPIN 1957 1991 34 33
    FORB RUE ANEMONE LAST FLOWER HENNEPIN 1957 1991 34 30
    FORB SHARP-LOBED HEPATICA FLOWERING HENNEPIN 1957 1991 34 33
    FORB SHARP-LOBED HEPATICA LAST FLOWER HENNEPIN 1958 1991 33 32
    FORB SHOOTING STAR FLOWERING HENNEPIN 1959 1990 31 31
    FORB SHOWY LADY’S SLIPPER FLOWERING HENNEPIN 1960 1991 31 30
    FORB SKUNK CABBAGE FLOWERING HENNEPIN 1957 1992 35 34
    FORB SKUNK CABBAGE LAST FLOWER HENNEPIN 1959 1991 32 31
    FORB SNOW TRILLIUM FLOWERING HENNEPIN 1957 1991 34 34
    FORB SNOW TRILLIUM LAST FLOWER HENNEPIN 1959 1991 32 31
    FORB SPREADING DOGBANE FIRST FALL COLOR ITASCA 1985 2016 31 31
    FORB STARFLOWER FLOWERING ITASCA 1984 2016 32 36
    FORB SWAMP BUTTERCUP FLOWERING HENNEPIN 1960 1991 31 31
    FORB SWAMP MILKWEED FLOWERING HENNEPIN 1960 1991 31 30
    FORB WHITE TROUT-LILY FLOWERING HENNEPIN 1960 1991 31 30
    FORB YELLOW TROUT-LILY FLOWERING HENNEPIN 1960 1991 31 30
    MAMMALS WHITE TAILED DEER FIRST ANTLERS ITASCA 1984 2016 32 72
    WOODY AMERICAN ELM FLOWERING RAMSEY 1941 1991 50 51
    WOODY AMERICAN ELM LEAF BUDBREAK RAMSEY 1941 1991 50 51
    WOODY AMERICAN TAMARACK ALL LEAVES COLORED (GROUP) ITASCA 1984 2016 32 34
    WOODY AMERICAN TAMARACK LEAF BUDBREAK ITASCA 1984 2016 32 45
    WOODY APPLE FLOWERING RAMSEY 1941 1991 50 51
    WOODY APPLE LAST FLOWER RAMSEY 1941 1991 50 51
    WOODY APPLE LEAF BUDBREAK RAMSEY 1941 1991 50 51
    WOODY BEAKED HAZELNUT FIRST POLLEN VISIBLE ITASCA 1984 2016 32 39
    WOODY BIG TOOTHED ASPEN LEAF BUDBREAK ITASCA 1985 2016 31 36
    WOODY BUR OAK LEAF BUDBREAK RAMSEY 1941 1991 50 51
    WOODY COMMON LILAC FLOWERING ITASCA 1985 2016 31 31
    WOODY LILAC FLOWERING RAMSEY 1941 1991 50 51
    WOODY LILAC FULL FLOWERING RAMSEY 1941 1991 50 50
    WOODY LILAC LEAF BUDBREAK RAMSEY 1941 1991 50 51
    WOODY PIN CHERRY [FIRE C, BIRD C] FLOWERING ITASCA 1984 2015 31 30
    WOODY PUSSY WILLOW FLOWERING ITASCA 1984 2016 32 70
    WOODY QUAKING ASPEN LEAF BUDBREAK RAMSEY 1941 1991 50 50
    WOODY RED ELDERBERRY FLOWERING RAMSEY 1941 1991 50 50
    WOODY RED ELDERBERRY LEAF BUDBREAK RAMSEY 1941 1991 50 51
    WOODY RED MAPLE COLORED LEAVES ANOKA 1984 2017 33 66
    WOODY SILVER MAPLE FLOWERING RAMSEY 1941 1991 50 51
    WOODY SPECKLED ALDER [HOARY A] FLOWERING ITASCA 1985 2016 31 35
    WOODY TAMARACK COLORED NEEDLES ANOKA 1984 2017 33 39
    WOODY TREMBLING ASPEN LEAF BUDBREAK ITASCA 1984 2016 32 49
    Table 23.2: Minnesota species with at least 30 years of data for a specific phenophase in the Minnesota Phenology database.
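
    A table with these columns can be built with group_by(), summarize(), and filter(). The sketch below runs on a small made-up data set (toy is hypothetical), and it assumes that time_observed is the difference between the last and first year, which matches the values in the table above.

    ```r
    library(dplyr)

    # made-up records: LILAC has 35 yearly entries, APPLE only 10
    toy <- tibble(
      group = "WOODY",
      species_common_name = rep(c("LILAC", "APPLE"), times = c(35, 10)),
      event = "FLOWERING",
      county = "RAMSEY",
      year = c(1950:1984, 1950:1959)
    )

    long_records <- toy %>%
      # one row per species, phenophase, and location
      group_by(group, species_common_name, event, county) %>%
      summarize(min_year = min(year),
                max_year = max(year),
                time_observed = max_year - min_year,
                n_observations = n(),
                .groups = "drop") %>%
      # keep only long, well-sampled records
      filter(time_observed >= 30, n_observations >= 30) %>%
      arrange(group, species_common_name, desc(time_observed))

    long_records
    ```

    Only LILAC remains, because APPLE has neither enough years nor enough entries.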

    We have already looked at a tree - let’s see if we can broaden the scope of the organisms we investigate and choose examples from different groups of species. For efficiency, we’ll divvy up the work and have everyone choose a different species.

    Consider this

    Call dibs on the species you would like to investigate. Do a tiny-google and write a 3-5 sentence description of your species. List the phenological questions you could pose about the species you have chosen based on your data set7.

  • 7 If you were writing a report/paper this would be part of your introduction/background section.

  • Did it!

    [Your answer here]

    We are going to practice putting together all the components of a data science-esque analysis.

    Data Science Process (H. Wickham & G. Grolemund: R for Data Science)

    Give it a whirl

    Choose the phenophase you want to investigate and go through the process of “Transform-Visualize-Analyze/Model” + “Communicate”.

    1. Transform: Create a subset of your data set that contains only the data for the species, phenophase, and location you have chosen.
    2. Visualize: Plot the change in the timing of the phenophase you have chosen over time.
    3. Model/Analyze: Calculate the rate of change using a linear regression.
    4. Import/Tidy/Transform: Determine what the appropriate temperature data is to match your phenophase data8, then download and import it.
    5. Visualize: Plot the change in temperature over time.
    6. Model/Analyze: Calculate the rate of change using a linear regression.
    7. Transform: Combine the phenophase and temperature data.
    8. Visualize: Plot the relationship of temperature & phenophase.
    9. Model/Analyze: Determine if the relationship of temperature/phenophase is significant.
  • 8 Remember this will include matching the geographic location, the time of year the temperature should be from, and the time span, so that the years match those in your data set

  • Describe what you are doing as you go, i.e. describe your methods9.

    Create a final multi-panel figure of your three visualizations and share it with your classmates in our slack channel along with your results (be specific).

  • 9 Pro Tip: Use the instructions as your starting point, for some of them you would want to add a detail or two specific to your analysis.

  • Did it!

    [Your answer here]

    Consider this

    Collect the results from all the species/phenophases we have analyzed (this includes the American Elm, your species, and those of your classmates) and discuss the overall results.

    • restate the overall broad question/central hypothesis
    • summarize in 2-3 sentences what data set you used (include all the species/phenophase analyses we’ve done as a class) and how you analyzed it.
    • summarize the results - which (if any) phenophases now occur earlier/later/have not changed? Which (if any) phenophases are significantly correlated with temperature? How are you reaching these conclusions?
    • discuss your results - what mechanisms are consistent with the patterns you have observed? Make sure to connect your final conclusion(s) back to your initial question/hypothesis.

    Did it!

    [Your answer here]

    Consider this

    A final question we should consider is what the potential impacts of changing phenologies could be. Consider that species within a biological community might be differentially impacted by climate change: some might be directly impacted by climate change, resulting in an altered phenology, while others might be indirectly impacted because of an altered phenology for a species they closely interact with (this is called a population asynchrony or phenological mismatch).

    Describe how you could set up an analysis to test whether one of the species we analyzed might experience a phenological mismatch. Your description should include what data you would need and how you would design your analysis.

    Did it!

    [Your answer here]

    23.8 Acknowledgments

    These activities are based on the EDDIE Phenology Trends and Climate Change in Minnesota module.10

  • 10 Freeman, P. (2021). Phenology Trends and Climate Change in Minnesota (Project EDDIE).